Pupil Size Is Sensitive to Low-Level Stimulus Features, Independent of Arousal-Related Modulation

Visual Abstract


Significance Statement
Pupils respond characteristically to various modulating factors.Even when the overall luminance remains constant, pupil-size reflects changes in low-level stimuli and arousal.However, it is currently unclear how different factors modulating pupil size interact with each other.In this study, we show an interaction between contrast and spatial frequency on pupil size modulation while remaining independent of arousal effects.Our findings highlight the need to delineate task-related responses from stimulus-evoked responses present in stimulus trials.

Introduction
Pupil size is not fixed, but rather adjusts to luminance by the pupillary light reflex (PLR; Ellis, 1981).Pupil size also constricts transiently in response to the sudden appearance of stimuli, even when mean luminance remains constant, a phenomenon termed the pupil onset response (Barbur et al., 1992;Sahraie and Barbur, 1997;Barbur, 2004;Sahraie et al., 2013;Binda and Gamlin, 2017).It has further been shown that pupil responses to equiluminant stimuli depend on stimulus properties (Barbur and Thomson, 1987;Hu et al., 2019).
In addition to responding to visual stimulation, pupil size reflects cognitive processes.Covertly attending to a bright stimulus (Binda et al., 2013a;Mathôt et al., 2013;Binda and Murray, 2015a, b;Mathôt et al., 2016) or images that appear brighter evokes pupil constriction, even when the veridical brightness is constant (Laeng and Endestad, 2012;Binda et al., 2013b;Naber and Nakayama, 2013;Turi et al., 2018).Pupil size has also been shown to correlate with physiological measures of alertness such as heart rate and skin conductance (Kahneman et al., 1969;Wang et al., 2018;Roth et al., 2020), and is therefore commonly used as a proxy for arousal (Bradley et al., 2008;Vinck et al., 2015;Reimer et al., 2016;Roth et al., 2020).Performing a periodic cognitive task evokes a task-related pupil response, specifically phasic pupillary dilation, likely reflecting periodic changes in arousal (Roth et al., 2020).
Why is pupil size associated with such a diverse repertoire of physiological responses?What is the purpose of these responses, and what are the mechanisms underlying them?To answer these questions, it is necessary to characterize the different types of pupillary responses and determine whether and in what manner they interact with one another.Over the past decades physiology studies have identified several neural circuits underlying pupillary responses (Wang and Munoz, 2015;Joshi et al., 2016;Ebitz and Moore, 2017;Joshi and Gold, 2020).For example, the primary light-driven PLR parasympathetic pathway is comprised of multiple projections between retinal ganglion cells, midbrain pretectal olivary nucleus (PON), Edinger-Westphal nucleus (EWN), and ciliary ganglion to the sphincter pupillae muscle, which contracts, thereby constricting the pupil.The darkness-driven sympathetic pathway involves the Intermediolateral nucleus projecting to the superior cervical ganglion, which projects to the dilator pupillae muscle, which contracts, resulting in dilation of the pupil.Additionally, each of these two pathways likely involves not only contraction of the activated muscle, but also relaxation of the antagonistic muscle.
But these pathways do not map neatly onto pupillary responses.On the one hand, multiple neural circuits underlie similar pupillary responses, such as the light and darknessdriven PLR.On the other hand, individual circuits are involved in multiple responses.For example, in addition to conveying luminance information from the retina, the EWN receives inputs from the PON and superior colliculus (SC; Huerta and Harting, 1984;May et al., 2016).These modulatory inputs may reflect both stimulus properties such as contrast and spatial frequency, and cognitive factors such as attention (Wang and Munoz, 2015).Finally, the EWN also receives projections from the locus coeruleus (LC; Nieuwenhuis et al., 2011;Szabadi, 2018), which is prominently implicated in arousal (Aston-Jones and Cohen, 2005;Carter et al., 2010;Murphy et al., 2011;Joshi et al., 2016;de Gee et al., 2017;Breton-Provencher and Sur, 2019).Because of the diversity of neural circuits that modulate pupil responses, it is difficult to infer the precise driver of the pupil response in a given experimental context.
Here, we adopt an alternative approach, instead characterizing the functional properties of different pupillary responses, determining their interactions or independence.We asked whether cognitive and visual factors interact in their effect on pupil size, focusing on stimulus properties (contrast and spatial frequency) and arousal.We hypothesized that (1) the impact of stimulus properties on pupil size are tightly related with each other, and (2) arousal modulates pupil size independently from the effects of stimulus properties.While there is substantial indirect evidence in support of the first hypothesis (Barbur and Thomson, 1987;Young and Kennish, 1993;Hu et al., 2019), our goal here was to definitively test for interactions between spatial frequency and contrast.Moreover, the independence of stimulus and arousal effects on pupil size is often assumed, but has not yet been tested directly.To test these hypotheses, we measured pupil size while subjects performed a demanding rapid serial visual presentation (RSVP) task at fixation.Unattended gratings at varying contrast levels and spatial frequencies were presented in the periphery, and arousal was modulated by monetary reward level.

Observers
Participants (N ¼ 70, male: 28; female: 42) were healthy human adults with no known major neurologic disorders, and with normal or corrected-to-normal vision.A total of 14 observers in the experiment could not be used for pupil analysis for technical reasons (e.g., pupil responses not fully recorded).All participants were naive to the purposes of the experiment.Experiments were conducted with the written consent of each participant.The consent and experimental protocol followed the safety guidelines for MRI research and approval by the Institutional Review Board at National Institutes of Health (93-M-0170).

Stimuli
Stimuli were generated on Apple iMac using MATLAB (MathWorks) and MGL (Gardner et al., 2018), and were presented on a 61-inch screen (BenQ XL242OZ) positioned 57 cm in front of the participant.A portion of the participant group (five observers) were tested with a VIEWpixx/3D display (Vpixx Technologies).The stimuli consisted of a stream of central digits, and peripheral stimuli that appeared on 75% of trials.The peripheral stimuli were gratings at one of two possible contrast levels (high: 1 or low: 0.2 Michelson contrast) and at one of five possible spatial frequencies [0.5, 0.8, 1.3, 2, 3.2 cycles per degree (cpd)].Separately, 15 participants were tested with peripheral gratings at one of five possible contrast levels (0.2, 0.3, 0.4, 0.7, 1 Michelson contrast) and five spatial frequencies.The gratings extended from 1.3°eccentricity to the edge of the screen and had the same mean luminance as the gray background (103.60 cd/m 2 ).During each stimulus trial, four gratings appeared at random orientations, for 500 ms each.

Behavioral procedures
Participants performed an RSVP task at central fixation.Each trial lasted for 15 s.During the entire run numerical digits replaced each other rapidly at the center of the screen.The stimulus onset asynchrony (SOA) was determined by a one-up-two-down staircase in the initial practice run (Levitt, 1971).
The trial began with a precue, a cyan circle surrounding the digit for 500 ms.Participants were instructed to count the number of zeros that appeared on the screen until the response cue, which appeared at 500 ms before the end of 2 s from the precue, during which central numerals were presented throughout all trials and unattended peripheral gratings were also presented for the stimulus trials.After the response cue, participants had 2 s to respond with a button press (left ¼ two zeros, right ¼ other number of zeros) once indicated by the response cue.Participants maintained fixation on the presented numerals but did not begin counting until the start of next trial.The experiment consisted of 4-11 runs for each participant (four runs for 1 participant; 5 runs for 1 participant; 6 runs for 22 participants; 7 runs for four participants; 8 runs for 20 participants; 9 runs for four participants; 10 runs for 1 participant; 11 runs for three participants) with each run comprised of 17 trials.The interstimulus interval was 0 ms.In addition to the stream of numerals at fixation, unattended peripheral gratings were presented on 75% of the trials.On each stimulated trial, four gratings appeared, for 500 ms each, with zero Interstimulus interval (ISI).The first stimulus appeared in tangent with the precue, and the last stimulus disappeared with the response cue.The remaining 25% of the trials consisted of the RSVP task with a blank gray background.Mean luminance was kept constant during the entire run.In the practice run, participants received feedback immediately following their response, in the form of a green (correct) or red (incorrect) circle that appeared around the digit stream.Participants were instructed before each run whether it would be a high or low reward run, and were informed of their gains or losses at the end of each run.Participants were instructed to fixate the numerals continuously for the entire run.

Eye-tracking procedures
Pupil size and eye position were continuously recorded at a sampling frequency of 500 Hz using an EyeLink 1000 Plus (SR Research).

Eye data analysis Preprocessing
All data analysis were performed using MATLAB (MathWorks) and MGL (Gardner et al., 2018).Recorded raw time series were concatenated across all runs and zscored.For all other analysis regarding phasic pupil size, blinks were identified based on the standard criteria used by EyeLink software and removed together with three timepoints before and after each blink.Pupil size during blinks was replaced with linear interpolation between starting and endpoint of the blink.Occasional periods of missing data within the trial were interpolated in the same manner.Next, the interpolated time series were low pass filtered at 4 Hz, using third order Butterworth filters.Since filtering introduces a phase shift in the data, the time series were first lengthened with additional 200 points at the start and end of the time series.The extra values were linearly interpolated from the nearest 50 timepoints of the original time series.Then, the lengthened time series was reversed and run through the filter a second time with added timepoints trimmed after filtering.The pupil recordings of the first 7500 samples in each trial, which was the minimal trial length across all runs, were analyzed to combine data from all experiment versions.

Linear regression of task-related and stimulus-evoked responses
Preprocessed pupil time series were averaged across all runs based on the average and the SD of pupil diameter of the entire time series.We conceptualized phasic pupil responses as an output of a linear system of response amplitude multiplied by a canonical response unique to each individual.The response amplitude is thought to depend on arousal and changes in stimulus properties (Barbur and Thomson, 1987;Young and Kennish, 1993;Bradley et al., 2008;Vinck et al., 2015;Reimer et al., 2016;Hu et al., 2019;Roth et al., 2020).To quantify response amplitude, we modeled stimulus trial responses as a combination of task-related and stimulus-evoked responses and regressed against the mean response to obtain trial wise amplitude estimates.We first characterized the response template, a response profile constant over time.The amplitude of assumed inputs can then be estimated from observed pupil time series using linear regression, which can be written as where Y (7500 Â 1 per trial) is the measured pupil response time series, X (7500 Â 2 per trial) is the design matrix, and b (2 Â 1 per trial) is a vector of response amplitudes.The design matrix consists of a participant specific response template calculated by averaging time series across all null trials, and a constant regressor.We separately modeled null and stim trials, such that the resulting design matrix included trial-type-specific regressors for task-related and stimulusevoked impulses unique to each participant.The null trials were extracted and averaged across all runs separately for high and low reward to produce a null trial response template specific to different reward conditions.Stim trial response template was also computed separately for each reward condition, first subtracting the average null trial from stim trials, and then averaging the result across runs.Thus, each participant had four unique response templates reflecting stim and null trials for high and low reward runs.To enable comparisons between regression coefficients in different conditions, the response template regressor was normalized through z-scoring.This procedure ensures that the response template regressor has a common scale across all conditions, and the response amplitudes can be compared between reward levels (or spatial frequency and contrast levels for stimulus trials).We regressed each trial timeseries with the response template of the same reward condition and trial type to generate one response amplitude per trial.
We assessed overall model fit and compared the goodness of fit between trial types by computing the average of all R 2 values pooled across subjects under respective conditions.For every trial, we fit two free parameters to the linear regression model, the response amplitude and the constant regressor.
We evaluated the impact of reward on response amplitude by using two complementary measures: (1) b weight calculated from a linear regression and (2) the SD of pupil size across time on each individual trial.The impact of reward on response amplitudes were both analyzed using b weight and standard deviation measures.

Statistical comparisons
We assessed the paired data for normality by applying the Shapiro-Wilk test and visually examining the data distribution using histograms and QQ plots.To identify significant differences in response amplitudes across reward conditions and stimulus properties separately for each trial type, we employed both parametric and nonparametric tests.We performed statistical comparisons across participants, using each participant's mean response amplitude as an observation.
We conducted paired t tests to compare response amplitudes between high and low reward conditions for all participants at two contrast levels and five spatial frequency levels.We found minor deviations of the data from normality and we therefore repeated these comparisons using Wilcoxon signed-rank sum tests, a nonparametric statistical test that does not assume a normal distribution.The distribution of correctness difference between reward level and trial type (Extended Data Fig. 5-4) did not deviate from normal, however, for comparable statistical results, we analyzed behavioral performance using both parametric t tests and nonparametric Wilcoxon signed-rank sum tests.Behavioral performance (% correctness) was calculated excluding missed responses.
To evaluate the main effects of reward level, spatial frequency, contrast, and their interaction, on pupil size, we employed two-way ANOVAs and nonparametric Scheirer-Ray-Hare tests (SRH).We used a three-way ANOVA to validate results obtained from the two-way ANOVA and to establish the relationship between all variables and pupil size variation.Moreover, we confirmed that response amplitude accurately captures phasic changes in pupil size by re-running all statistical tests using the SD of each trial's time series.We repeated all statistical comparisons for each participant group tested under identical experimental conditions.
We wanted to ensure that the regression analysis for estimating response amplitude (i.e., subtracting the null-trial response template from stim trials and then reevaluating the response amplitude) did not skew the results, diminishing any true effect of reward on the stimulus-evoked response.We used bootstrapping to validate that this was not the case.For each participant, we extracted the pupil measurement on null trials, which consisted of task-related responses in the absence of stimulus-induced responses.These null trials were randomly split into two halves, namely "pseudo null" and "pseudo stim" trials, while keeping the reward labels intact.We then computed response templates for these newly reassigned trial groups separately for high and low reward conditions and calculated response amplitudes for each trial.The pseudo null response template was computed by averaging across pseudo null trials in all runs corresponding to high or low reward, and amplitudes were estimated by regressing pseudo null trials with the pseudo null response template.The pseudo null response template was then subtracted from all pseudo stim trials, which were then averaged to create a pseudo stim response template, and regressed with the pseudo stim trials to yield amplitude estimates.Low reward mean amplitude was subtracted from that of high reward and averaged across trials within the pseudo trial types.We repeated this procedure 10,000 times for each participant to generate a null distribution of the mean amplitude difference between high and low reward separately for pseudo stim and pseudo null trials.We then evaluated the actual amplitude difference against the generated null distribution.The p value is the fraction of the null distribution that exceeded or equaled the actual difference between high and low reward response amplitudes.

Code and data availability
The experimental datasets used for the current study will be available from the corresponding author on request.The code/software described in the paper is freely available online at https://github.com/elimerriam/pupilArousal.git.

Results
Participants performed an RSVP task at fixation, in which a sequence of numbers was presented at the center of the screen while unattended stimuli appeared in the periphery, followed by a 2-s response window (Fig. 1).Participants earned monetary reward based on their performance, according to one of two reward levels.The unattended gratings appeared at varying contrasts and spatial frequencies.

Stimuli evoke pupil responses unrelated to task or attention
Each run consisted of null trials and stimulus trials.On null trials, subjects performed the RSVP task, and no peripheral stimuli were presented.Because there was no peripheral stimulus, any modulation in pupil size on null trials could only be attributed to cognitive aspects of the task.On stimulus trials, unattended peripheral gratings appeared while participants performed the same RSVP task as on null trials.On each stimulus trial, gratings were presented at one of two different contrast levels (high and low contrast) and one of five different spatial frequency levels (0.5, 0.8, 1.3, 2.0, and 3.2 cpd).On stimulus trials, modulations in pupil size reflect both task-related and stimulus-evoked responses.We therefore isolated stimulus evoked responses and analyzed them separately from the task-evoked component.
We observed prominent pupil responses for both contrast levels and for all spatial frequencies.Response amplitude varied systematically with both stimulus contrast and spatial frequency (Fig. 2A-D).
It is well known that pupil size is sensitive to global changes in luminance (McDougal and Gamlin, 2015;May et al., 2019;Pan et al., 2022).Additionally, equiluminant stimuli evoke pupil responses that depend on contrast and spatial frequency (Barbur and Thomson, 1987;Young and Kennish, 1993;Hu et al., 2019).Our results show that pupil size is sensitive to both stimulus contrast and spatial frequency under fixed stimulus luminance, even when stimuli are not task relevant.These changes in pupil size were not a result of the pupillary light reflex, since the gratings were of the same mean luminance as the gray background.Nor were these responses likely to be caused by a shift in gaze to the grating stimuli, since subjects were performing a demanding task at fixation.In fact, participants maintained central fixation and did not tend to make saccades toward the peripheral stimuli.Gaze position during peripheral stimuli presentation showed small deviation within the central digit cue size of 0.55°(Extended Data Fig. 2-1) for both horizontal and vertical position.Saccades detected during the first 3 s followed main sequence, small in amplitude compared with stimulus eccentricity, and presented horizontal bias (Extended Data Fig. 2-2).

Response window 2s
Figure 1.Experiment design.Participants continuously fixated on a stream of numerical digits at the center of the screen while performing an RSVP task.On each trial, sinusoidal gratings of a specific contrast and spatial frequency level were briefly presented at four different orientations occupying the entire screen except for a 1.3°annulus at the screen center.Each stimulus was presented for 500 ms with 0-ms interstimulus interval.Participants indicated whether they saw exactly two 0s or more or less than two 0s after response cue, during the 2-s response window.No feedback was provided, except during the practice run.Participants maintained fixation throughout the 15-s trial.In each trial, participants could gain either a high or low monetary reward for correct performance.
Reward modulates task-related but not stimulusevoked responses On both stimulus trials (Fig. 3A) and null trials (Fig. 3B) we observed pupil responses entrained to trial timing.The task-related response on null trials was dilatory (Fig. 3B), while on stim trials the stimulus-evoked response was pupil constriction (Fig. 3C).
We evaluated the impact of reward on task-related and stimulus-evoked responses by quantifying their response amplitudes, which were obtained by regressing trial-wise responses against the mean response (Fig. 4B) and later averaged for statical comparisons (Fig. 4A).Additionally, we repeated the analysis with subject groups under different conditions such as increased contrast levels and a different monitor to (1) confirm the impact of reward on stimulus-evoked response  and to (2) ensure consistent results with linear luminance display output (Extended Data Fig. 5-1, 5-2, 6-1, 6-2).
A bootstrapping analysis indicated that the absence of an effect of reward on stimulus-evoked response was not an artifact of the regression analysis method (Fig. 5; see Materials and Methods, Statistical comparisons).We found that the mean of the null distribution for pseudo stim trials, which should have exhibited no effect, was indeed roughly zero.Furthermore, the stimulus-evoked response amplitude difference between reward conditions did not differ significantly from the null distribution (16.12% of the null distribution exceeded or equaled the actual response amplitude difference between reward conditions), indicating that the response template analysis did not skew the results, thereby validating the regression analysis (Fig. 5C; Extended Data Figs.5-1C, 5-2C, 5-3C; see Materials and Methods).Attention has been shown to modulate pupil size (Gabay et al., 2011;Wierda et al., 2012;Hu et al., 2019).In our experiment, attention was held constant throughout each trial because of the demanding RSVP task at fixation, controlling for effects of attention on pupil size.The changes we observed in pupil size are most likely because of increased arousal generated by higher reward levels, which in turn increased the amplitude of taskrelated responses.

Neither peripheral stimuli nor reward shifts attention during task
We hypothesized that pupil responses to the peripheral stimuli under different reward conditions were independent of subjects' ability to perform the task at fixation.We tested this hypothesis by comparing behavioral accuracy on the task across reward conditions.We used a Shapiro-Wilk normality test to evaluate the distributional assumptions of the behavioral data and found no evidence of non-normality in the difference in behavioral performance between high versus low reward conditions (W (40) ¼ 0.954, p ¼ 0.0969) and null versus stim trial types (W (40) ¼ 0.987, p ¼ 0.93).Reward did not impact behavioral performance on the RSVP task.There was no significant difference in % correctness between high reward and low reward conditions [high reward, 76.7% 6 12.9; low reward, 75.8 6 12.4 (mean 6 SD), average difference ¼ 0.932%, t suggests that the effect of reward is attributed to arousal, rather than attention, as reward increased arousal without affecting participants' allocation of attention to the fixation task, consistent with previous studies (Baruni et al., 2015;Roth et al., 2020).We found that behavioral accuracy was slightly higher during stim trials compared with null trials, and this difference was marginally significant, but only when using a parametric statistical test [stim trial, 77.0% 6 11.6; null trial, 73.85 6 15.9 (mean 6 SD), average difference ¼ 3.554%, t (40) ¼ 2.134, p ¼ 0.039, 95% CI [0.195 6.914], paired t test; Z ¼ 1.64, p ¼ 0.101, 95% BCa.CI [À1.839,6.441],Wilcoxon signed-rank test].However, we do not take this to mean that subjects' were shifting attention while performing the task on stim trials.Had subjects been attending the peripheral stimuli, we would expect accuracy on stim trials to be lower than null trials.We observed the opposite: behavioral accuracy was higher on stim trials.This trend was consistent across smaller subject groups tested with five contrast levels and on a different monitor, but the difference did not reach significance (Extended Data Figs.5-2B, 5-3B), indicating that attention levels were indeed constant.Therefore, we conclude that the stimulus-evoked pupil responses are not the result of attentional shifts away from fixation, supporting the conclusion that attention levels were maintained.

Spatial frequency and contrast interact
To better understand the relationship between rewardinduced arousal and stimulus features in their impact on stimulus-evoked responses, we tested whether the impact of stimulus properties on pupil response amplitude depended on reward.
We performed an additional three-way mixed ANOVA (spatial frequency Â contrast Â reward) to test whether the interaction between spatial frequency and contrast depended on reward level.The three-way ANOVA returned significant main effects of both stimulus properties (spatial frequency and contrast) on the stimulus-evoked response amplitude.Consistent with the two-way ANOVAs reported above, no interaction was found between stimulus properties and reward.Instead, we found that spatial frequency and contrast interact nonlinearly on stimulus-evoked response (F (4,3166) ¼ 68.64, p ¼ 1.16e-35), again consistent with the two-way ANOVA reported above.Collectively, the results indicate that the effect of stimulus properties on stimulus-evoked pupil response is independent of reward and arousal.These findings suggest that pupil effects of reward and stimulus properties rely on distinct neural pathways.

Summary
In this study, we identified two distinct pupil responses: on null trials pupils exhibited a dilatory task-related response, while on stimulus trials the response was a combination of a task-related response (dilation) and a stimulus-evoked response (constriction).This was despite the fact that subjects did not attend the stimuli.The stimulus-evoked response was modulated by stimulus properties, while the task-related response increased with reward.The effects of reward and stimulus properties on task-related and stimulus-evoked responses exhibited a double dissociation: reward modulated task-related but not stimulus-evoked responses, while stimulus properties modulated stimulus-evoked responses.Based on these findings we hypothesize that reward and stimulus properties affect pupil size through distinct neural pathways.

Underlying neural circuitry
While multiple neuromodulatory nuclei, including locus coeruleus (LC) and superior colliculus (SC) project onto EWN to control pupil size (Joshi et al., 2016;May et al., 2016;Joshi and Gold, 2020), based on the double dissociation of task-related and stimulus-evoked responses, it is possible that separate neural circuits underlie the two distinct pupil responses we identified.
Activity in LC is correlated with arousal and with pupil dilation (Aston-Jones and Cohen, 2005;Bradley et al., 2008;Carter et al., 2010;Murphy et al., 2011;Joshi et al., 2016;Reimer et al., 2016;de Gee et al., 2017;Wang et al., 2018;Breton-Provencher and Sur, 2019).It has been suggested that inhibitory projections from LC to preganglionic EWN drive pupil dilation in tandem with increased arousal and cognitive effort (Nieuwenhuis et al., 2011;Joshi et al., 2016;Liu et al., 2017).Conversely, the SC, which plays a role in the PLR, is sensitive to low-level stimulus properties.For example, neurons in macaque SC are tuned to spatial frequency, and increase their firing rate monotonically with frequency, within the range of frequencies we used here (C.Y.Chen et al., 2018).
Anatomical projections exist between the LC and SC pathways (Li and Wang, 2018;Li et al., 2018), and these could interact to change pupil diameter.However, our results suggest that the two circuits may function independently under certain conditions, as in the current study.We note that modulation by arousal and by stimulus properties may appear to interact under extreme conditions.For example, at high luminance conditions pupil size saturates (Y.Chen and Kardon, 2013;Reilly et al., 2019), and arousal is unlikely to further increase pupil size in a linear fashion, resulting in a sublinear summation of effects.In order for arousal and stimulus properties to impact pupil size linearly and independently, two conditions must cooccur: (1) pupil size must not be at its limits, and (2) arousal and stimulus property signals must not interact with each other.Our results imply that both these conditions hold in our experiment: pupil size remained within its dynamic range, and more importantly, distinct circuits facilitate task-evoked and stimulus evoked pupil responses.

Relationship to previous studies
Multiple studies have found that pupil size tracks arousal (Bradley et al., 2008;Carter et al., 2010;Reimer et al., 2016;Koelewijn et al., 2018;Slooten et al., 2018;Roth et al., 2020).Based on physiological findings, pupil dilations have been predominantly linked to arousal.Changes in pupil diameter covary with measures of peripheral autonomic activity, as well as spiking in regions such as LC (Aston-Jones and Cohen, 2005;Joshi et al., 2016;Reimer et al., 2016) and basal forebrain (Reimer et al., 2016) that generate widespread arousal-related neuromodulation (Joshi and Gold, 2020).Similarly, cognitive effort and reward modulate pupil response (Slooten et al., 2018).Specifically, pupil response during task performance is modulated by task difficulty, surprise, and behavior performance (Kahneman and Beatty, 1966;Barbur and Thomson, 1987;Aston-Jones and Cohen, 2005;Willems et al., 2015;Burlingham et al., 2022) variables that are known to scale with arousal levels.We and others have identified a task-related pupil response that is entrained to trial timing (Figs. 2 and 3), and likely reflects arousal (Roth et al., 2020;Burlingham et al., 2022).
Previous studies have also identified pupil responses to equiluminant stimuli, referred to as the reorienting response (Mathôt et al., 2014(Mathôt et al., , 2017;;Wang and Munoz, 2015;Ebitz and Moore, 2019) or the onset response (Binda and Gamlin, 2017).These responses are modulated by contrast and spatial frequency (Barbur and Thomson, 1987;Young and Kennish, 1993;Wang and Munoz, 2014;Wang et al., 2014;Hu et al., 2019) similar to the stimulusevoked response we report in the current study.It has been suggested that such responses may be a result of exogenous attention, since the sudden appearance of stimuli can grab attention (Yantis and Jonides, 1984;Barbur et al., 1992;Donk and van Zoest, 2008;Binda and Gamlin, 2017).Indeed, in many such studies measuring pupil size in response to stimuli attention is not controlled.In some studies, the same stimulus is both the target of attention, and the stimulus that modulates pupil size (Barbur and Thomson, 1987;Barbur et al., 1992;Barbur, 2004;Wierda et al., 2012;Hu et al., 2019).In other studies, subjects are instructed to fixate away from the stimulus, and it is therefore unclear whether or not the subjects are attending the stimulus or not (Gabay et al., 2011;Binda et al., 2013a;Mathôt et al., 2013;Naber and Nakayama, 2013;Hu et al., 2019).In both cases, it is difficult to determine whether stimulus effects on pupil size are dependent on attention to the stimulus.Previous reports of stimulus-evoked changes in pupil size are therefore confounded by attention.In our study, we control for attention by having subjects perform a demanding, continuous task at fixation, away from the peripheral stimuli.We found however that unattended stimuli evoked a pupil response, suggesting that the stimulusevoked response is not the result of a shift in attention.Instead, our findings point toward a stimulus-evoked pupil response independent of attention, that scales with stimulus visibility (Strang et al., 1999;Lee et al., 2014;Wang et al., 2014).
Key presses are known to induce pupil dilation (Richer and Beatty, 1985;Strauch et al., 2018), which could potentially contribute to the task-related responses observed in our study.We found no significant difference in reaction time between high and low reward conditions for null trials (reaction time: average difference ¼ À0.0217s, t (40) ¼ À1.21, p ¼ 0.234, 95% CI [À0.0580, 0.0146] paired t test; Z ¼ À0.706, p ¼ 0.480, 95% BC.a CI [À0.0222, 0.0196], Wilcoxon signed-rank test), despite task-related response amplitudes being significantly higher for higher reward.Furthermore, we used separate response templates for each trial type and reward condition.Thus, any effect from reaction times would be reflected in the response amplitude estimates.Therefore, we conclude that it is unlikely that differences in motor response characteristics (such key presses reaction time) between the conditions were driving the difference in task-related response amplitudes between reward conditions.

Implications for future research
Properly distinguishing between task-related and stimulus-evoked pupil responses is crucial for understanding the drivers of pupil size dynamics.For example, without removing the task-related response, reward may appear to impact the stimulus-evoked response (Fig. 3A), when in fact, our results suggest that reward affects only the taskrelated response (Fig. 3B).After analytically removing the task-related component, no effect of reward on the stimulus-evoked component is evident (Fig. 3C).Similarly, without distinguishing between stimulus-evoked and taskrelated responses, stimuli may appear to evoke biphasic responses (Fig. 3A): constriction followed by dilation and finally return to baseline (see, for example, Wang et al., 2014).Future research is necessary to investigate how additional cognitive processes interact with stimulusevoked responses.

Figure 2 .
Figure 2. Stimulus-evoked responses.A, Mean pupil size for high contrast, low contrast, and null trials.Pupil size exhibited a transient response entrained to task timing, constricting during peripheral stimuli presentation, and returning to baseline before the precue of the next trial.B, Stimulus-evoked pupil response for high-contrast and low-contrast trials.Stimulus-evoked responses were calculated by subtracting the mean null trial response from stimulus trials responses.C, Mean pupil size for each spatial frequency (0.5, 0.8, 1.3, 2.0, and 3.2 cpd) and for null trials.D, Stimulus-evoked pupil response for each spatial frequency.Light gray bar, peripheral stimulus presentation, darker gray bars, pre and response cues.Shaded regions, 6 SEM across participants.See also Extended Data Figure 2-1 for gaze position and Extended Data Figure 2-2 for saccades during trials.

Figure 3 .Figure 4 .
Figure 3. Mean pupil size for high and low reward on (A) stimulus trials and (B) null trials computed separately for low and high reward.C, Stimulus-evoked pupil response for high and low reward trials.Stimulus-evoked responses calculated by subtracting mean null trials from stimulus trials for each reward conditions.Light gray bar, peripheral stimulus presentation, darker gray bars, response cues.Shaded regions, 6SEM across participants.

Figure 5 .
Figure 5. Reward modulates task-related response.A, Average response amplitude for high (red) and low (blue) reward conditions in null and stim trials.Data points correspond to response amplitudes averaged across each condition, for individual participant.Participants with decreasing response amplitude and increasing response amplitude color coded as blue and red, respectively.B, Correctness (%) on each trial averaged across participants compared between reward level (left panel) and trial type (right panel).Points correspond to individual subjects' average correctness.Error bars are 61 SEM across participants; *p , 0.05; **p , 0.005; ***p , 0.001; ns p .0.05.Extended Data Figure 5-4 plots the distribution of correctness difference between reward level and trial types.C, Actual response amplitude difference between reward conditions (solid red line) compared against bootstrapped null distribution for stim and null trials.Trial labels for null trial response were permuted equally into pseudo null and pseudo stim trials within each reward condition.The black line denotes zero and the red line denotes the actual difference between high and low reward.See also Extended Data Figures5-1, 5-2, and 5-3 for the repeated analysis with subject groups under different conditions.

Table 1 :
Statistics tableFigure6.Reward and stimulus features are independent modulators of pupil response.A, Stimulus-evoked response amplitude for high and low reward compared across contrast and (B) spatial frequency.C, Response amplitude for high and low contrast across spatial frequency levels.Spatial frequency and contrast level interact to influence stimulus-evoked response.Data points correspond to mean response amplitude, and shaded regions correspond to 6 SEM across participants.See also Extended Figures6-1, 6-2, and 6-3 for the repeated analysis with subject groups under different conditions.